Transformer based named entity recognition for place name extraction from unstructured text

Abstract

Place names embedded in online natural language text present a useful source of geographic information. Despite this, many methods for the extraction of place names from text use pre-trained models that were not explicitly designed for this task. Our paper builds five custom-built Named Entity Recognition (NER) models and evaluates them against three popular pre-built models for place name extraction. The models are evaluated using a set of manually annotated Wikipedia articles with reference to the F1 score metric. Our best performing model achieves an F1 score of 0.939, compared with 0.730 for the best performing pre-built model. Our model is then used to extract all place names from Wikipedia articles in Great Britain, demonstrating the ability to more accurately capture unknown place names from volunteered sources of online geographic information.
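As an illustration of the F1 score metric mentioned in the abstract, the following is a minimal sketch of entity-level precision, recall and F1 for place name spans. It assumes exact-match scoring over (start, end) character offsets; the abstract does not state the matching scheme the authors used, and the function name `span_f1` and the sample offsets are hypothetical.

```python
def span_f1(gold, predicted):
    """Entity-level precision, recall and F1 over exact-match spans.

    Assumption: a predicted span counts as correct only if its
    (start, end) offsets match a gold span exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical character offsets of place names in one annotated article.
gold = [(0, 6), (20, 30), (45, 52), (60, 70)]
predicted = [(0, 6), (20, 30), (45, 50)]

precision, recall, f1 = span_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

F1 is the harmonic mean of precision and recall, so a model scores highly only when it both finds most annotated place names and avoids spurious ones.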

Similar resources

Named Entity Recognition in Persian Text using Deep Learning

Named entity recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

Concurrent Entity Recognition and Relationship Extraction from Unstructured Text

Entity recognition and entity relationship extraction are two very important tasks in information extraction. This paper proposes a new method for performing entity recognition and entity relationship extraction concurrently from unstructured text based on Conditional Random Fields (CRFs). This method makes use of entity features, entity relationship features and features of triples which is co...

Entity-Relationship Extraction from Wikipedia Unstructured Text

Wikipedia has been the primary source of information for many automatically-generated Semantic Web data sources. However, they suffer from incompleteness since they largely do not cover information contained in the unstructured texts of Wikipedia. Our goal is to extract structured entity-relationships in RDF from such unstructured texts, ultimately using them to enrich existing data sources. Ou...

Named Entity Recognition from Diverse Text Types

Current research in Information Extraction tends to be focused on application-specific systems tailored to a particular domain. The Muse system is a multi-purpose Named Entity recognition system which aims to reduce the need for costly and time-consuming adaptation of systems to new applications, with its capability for processing texts from widely differing domains and genres. Although the sys...

Accurate Unsupervised Joint Named-Entity Extraction from Unaligned Parallel Text

We present a new approach to named-entity recognition that jointly learns to identify named-entities in parallel text. The system generates seed candidates through local, cross-language edit likelihood and then bootstraps to make broad predictions across both languages, optimizing combined contextual, word-shape and alignment models. It is completely unsupervised, with no manually labeled items...

Journal

Journal title: International Journal of Geographical Information Science

Year: 2022

ISSN: 1365-8824, 1365-8816

DOI: https://doi.org/10.1080/13658816.2022.2133125